In empirical work in economics it is common to report standard errors that account for clustering of units. Typically, the motivation given for the clustering adjustments is that unobserved components in outcomes for units within clusters are correlated. However, because correlation may occur across more than one dimension, this motivation makes it difficult to justify why researchers use clustering in some dimensions, such as geographic, but not others, such as age cohorts or gender. It also makes it difficult to explain why one should not cluster with data from a randomized experiment. In this paper, we argue that clustering is in essence a design problem, either a sampling design or an experimental design issue. It is a sampling design issue if sampling follows a two stage process where in the first stage, a subset of clusters were sampled randomly from a population of clusters, while in the second stage, units were sampled randomly from the sampled clusters. In this case the clustering adjustment is justified by the fact that there are clusters in the population that we do not see in the sample. Clustering is an experimental design issue if the assignment is correlated within the clusters. We take the view that this second perspective best fits the typical setting in economics where clustering adjustments are used. This perspective allows us to shed new light on three questions: (i) when should one adjust the standard errors for clustering, (ii) when is the conventional adjustment for clustering appropriate, and (iii) when does the conventional adjustment of the standard errors matter.